GNNGL-PPI: multi-category prediction of protein-protein interactions using graph neural networks based on global graphs and local subgraphs

Most proteins exert their functions by interacting with other proteins, making the identification of protein-protein interactions (PPI) crucial for understanding biological activities, pathological mechanisms, and clinical therapies. Developing effective and reliable computational methods for predicting PPI can significantly reduce the time-consuming and labor-intensive associated traditional biological experiments. However, accurately identifying the specific categories of protein-protein interactions and improving the prediction accuracy of the computational methods remain dual challenges. To tackle these challenges, we proposed a novel graph neural network method called GNNGL-PPI for multi-category prediction of PPI based on global graphs and local subgraphs. GNNGL-PPI consisted of two main components: using Graph Isomorphism Network (GIN) to extract global graph features from PPI network graph, and employing GIN As Kernel (GIN-AK) to extract local subgraph features from the subgraphs of protein vertices. Additionally, considering the imbalanced distribution of samples in each category within the benchmark datasets, we introduced an Asymmetric Loss (ASL) function to further enhance the predictive performance of the method. Through evaluations on six benchmark test sets formed by three different dataset partitioning algorithms (Random, BFS, DFS), GNNGL-PPI outperformed the state-of-the-art multi-category prediction methods of PPI, as measured by the comprehensive performance evaluation metric F1-measure. Furthermore, interpretability analysis confirmed the effectiveness of GNNGL-PPI as a reliable multi-category prediction method for predicting protein-protein interactions.


Introduction
Protein-protein interactions (PPI) play a crucial role in various biological processes within cells.Identifying PPI is of great significance in advancing research across multiple life science fields, including medical diagnosis, drug design, and disease treatment [1].Currently, the methods for identifying PPI can be broadly categorized into traditional biological experimental methods and computational methods.Traditional biological experimental methods primarily involve techniques such as yeast two-hybrid [2], protein chips [3], and synthetic lethal analysis [4].However, these methods suffer from several disadvantages, including being time-consuming, laborintensive, and requiring high financial resources [5].To overcome these disadvantages, the computational methods for PPI prediction have developed rapidly.Nevertheless, these computational methods face dual challenges.Firstly, they need to accurately identify multiple specific categories of PPI.Secondly, they must achieve the desired predictive performance.
In recent years, the computational methods for predicting PPI have transitioned from docking-based methods to machine learning and deep learning-based methods.Docking-based methods [6] are capable of effectively predicting PPI, but they require high-quality protein 3D structures and significant computational resources.Additionally, they operate at a slower prediction speed, which cannot keep pace with the demands of processing massive data.Machine learning and deep learning methods, on the other hand, have exhibited better performance in handling large amounts of data [7][8][9][10].These methods can utilize both protein sequence and structural data.Machine learning-based methods [9,11,12] rely on protein sequence or structural features, utilizing models like SVM [10,13,14] and random forest [15] to predict PPI.Although these methods have displayed good predictive performance, they cannot automatically extract deep-level features of PPI from the original sequences or structures of proteins.This creates bottlenecks that hamper improvements in predictive performance.However, the emergence of deep learning models, such as multi-layer neural networks, has provided better model prediction performance and pointed researchers towards breakthroughs in addressing the performance bottlenecks encountered by machine learning techniques [16].
Methods for predicting PPI using deep learning techniques leverage protein sequences, structures, and PPI networks.Deep learning techniques such as deep neural networks (DNN) [17], convolutional neural networks (CNN) [18], recurrent neural networks (RNN) [19], attention mechanisms [20], and graph neural networks (GNN) [21] are employed to extract deep-level features from proteins and PPI networks.DNN-based methods extract protein features through multi-layer neural networks to directly predict PPI [22,23] or employ machine learning models for PPI prediction [24].CNN and RNN-based methods focus on extracting local features and long-range dependency features from protein sequences, respectively.For instance, LSTM-PHV method [8] and other related approaches [25] leveraged the LSTM (Long Short-Term Memory) model to capture long-range dependency features in protein sequences.Methods like ADH-PPI [26] and DCSE [27] integrated CNN and RNN to extract both local and long-range features from protein sequences, which were then combined to predict PPI.Attention mechanisms have also been widely utilized to identify key sequence features in protein sequences [23,26,28].While these DNN, CNN, RNN, and attention-based methods primarily focused on protein sequence features, they often overlooked the structural features of proteins and the hidden interaction features present in PPI networks.To address these limitations, GNN-based methods have emerged.However, incorporating protein structures into these methods has been challenging due to the slow exploration of protein 3D structures.With the advent of protein 3D structure prediction tools like AlphaFold [29] and ColabFold [30], obtaining monomer protein 3D structures has become easier.This has led to rapid advancements in research that utilizes GNN to extract protein structural features for predicting PPI, either independently or in combination with protein sequence features.For instance, the method proposed in reference [31] directly employed the GCN/GAT model to extract the structural features from two interacting proteins.These features were then concatenated and used to predict PPI.TAGPPI method [32] combined TextCNN and GAT to extract sequence and structural features from protein sequences and contact maps, respectively.The extracted features were fused before being passed through a fully connected (FC) layer to predict PPI.Other methods introduced interaction features by constructing a PPI network graph where proteins served as vertices and interactions served as edges.GNN was then used to extract interaction features from this graph, ultimately predicting PPI.Methods such as S-VGAE [33], HIGH-PPI [34], and Topsy-Turvy [35] utilized PPI network graphs along with protein sequences or structures as vertex features.GNN was applied for binary classification to determine whether there was an interaction between vertices in the graph.The aforementioned methods treated PPI as a binary classification task and did not identify specific interaction categories between proteins.However, methods like PIPR [36], GNN-PPI [37], SemiGNN-PPI [38], and AFTGAN [39] have been developed to predict PPI into multi-category.PIPR utilized a Siemens residual current convolutional neural network (RCNN) to extract local features and contextual information from protein sequences.GNN-PPI, SemiGNN, and AFTGAN leveraged GNN to learn deep-level features from PPI networks, enabling multi-category prediction.Although these methods extracted protein sequence features and PPI network graph features for multi-category PPI prediction, research in this area was still in its early stages.The performance of the models was somewhat below user expectations.Therefore, proposing an efficient and effective model for multi-category prediction of PPI presents significant challenges.
In this study, we proposed a novel graph neural network method, called GNNGL-PPI, for predicting multicategory of protein-protein interactions.GNNGL-PPI utilized a combination of global graph and local subgraph features.Specifically, we used Graph Isomorphism Network (GIN) to extract global graph features from PPI network graphs.Additionally, we employed GIN-AK to extract local subgraph features from subgraphs containing protein vertices in the PPI network graphs.Simultaneously, to address the issue of imbalanced samples for each category in the benchmark dataset, we introduced an Asymmetric Loss (ASL) Function.This loss function helped enhance the predictive performance of the model by assigning different weights to different categories based on their prevalence.We evaluated the model on six standard test sets created using three different dataset partitioning algorithms (Random, BFS, DFS).GNNGL-PPI consistently outperformed the state-of-the-art methods for multi-category prediction of PPI, as measured by the comprehensive evaluation metric F1-measure.Furthermore, we conducted interpretability analysis to validate the effectiveness of GNNGL-PPI.Experimental results confirmed that GNNGL-PPI was a reliable method for predicting multi-category of PPI.

Datasets
In this study, we approached protein-protein interactions prediction as a multi-category task, encompassing seven PPI categories: Reaction, Binding, Post-translational modification (Ptmod), Activation, Inhibition, Catalysis, and Expression.Each proteinprotein interaction pair is assigned to at least one of these categories.For example, the interaction category between protein 9606.ENSP00000005257 (RalA) and protein 9606.ENSP00000202677 (RalGAP A2) is Inhibition (PMID: 34767674), while the interaction category protein 9606.ENSP00000003100 (Cyp51) and protein 9606.ENSP00000240055 (NF-YB) is Activation (PMID:27438,727).To evaluate the performance of the model, we utilized two benchmark datasets, SHS27k and SHS148k, which were consistent with the dataset used in the GNN-PPI [37] method and exhibited sequence consistency of less than 40% in each dataset.These datasets consisted of 7624 (1690 proteins) and 44,488 (5189 proteins) protein-protein interaction pairs, respectively.The number and occupation ratio of samples for the seven PPI categories generated by these protein-protein interaction pairs were shown in Table 1.It was evident from Table 1 that the number of samples for the seven PPI categories was imbalanced.During the training and testing of the model, we also employed the Random, Breath First Search (BFS), and Depth First Search (DFS) algorithms proposed by the GNN-PPI method to partition the SHS27k and SHS148k datasets.For detailed information on the data partitioning algorithms for the Random, BFS, and DFS, please refer to the GNN-PPI method.

PPI network graph formation and protein features PPI network graph formation
Assuming a set of proteins P={p_1,p_2,…,p_n } with n as the number of proteins, each protein acted as a vertex in the PPI network graph.The interaction category between p_i and p_j was represented as edge e_ij (1≤i,j≤n).Different protein-protein interaction categories made up the category space D={D_1,…,D_t } (t=7) of the dataset, where t is the number of PPI categories.If there was an interaction between p_i and p_j of a certain category, the corresponding position in the adjacent matrix representing the PPI network graph was assigned a value of 1, otherwise, it was assigned a value of 0.

Protein features
We employed the pre-trained model MASSA [40] to capture high-level and more fine-grained features.This pre-trained model leveraged multi-modal protein data, including protein sequences, structures, gene ontology annotations, motifs, and region positions, to derive comprehensive protein features.Unlike previous research, which mainly focused on protein features derived from protein sequences such as Position-Specific Scoring Matrix (PSSM) [41] or Hidden Markov Models (HMM) [42] matrix, or treated protein sequences as natural language processing (NLP) [43] tasks to extract protein sequence features.However, these researches yielded relatively singular protein features and did not fully encompass other biochemical information pertaining to proteins.

Proposed model
To begin with, we input the multimodal data of proteins, such as sequences, structures, and Go annotations, into the pre-trained model MASSA and obtained the 512-dimensional protein pre-training features.Next, we embedded the pre-trained features using a layer of linear transformation.Finally, we used the embedded features as the protein vertex features in the PPI network (Fig. 1A).
The extraction of high-level features in the PPI network graph can be divided into two main parts, as shown in Fig. 1B and C: global graph features extraction and local subgraph features extraction.In the global graph features extraction part, we utilized one layer of GIN followed by two FC layers.ReLU activation function [44] was applied in the FC layers.To ensure stable model training, batch normalization function was employed in the last layer to normalize the extracted features and obtained the global graph features of the PPI network graph.Moving on to the local subgraph features extraction section: firstly, we selected a protein vertex as the center and included all vertices within a path distance of K from that the center vertex.This formed a subgraph of the protein.Secondly, we used the GIN-AK to extract the features from the subgraph.GIN-AK incorporated two distinct processes to extract global features and centroid features of the subgraph, respectively.These features were then combined to obtain the final local subgraph features.To extract the global features of the subgraph, we input both the subgraph and protein features into a single layer of GIN.The output features of GIN were then fed into a FC layer (gating unit) with a sigmoid activation function, resulting in the global features of the subgraph.For the centroid features of the subgraph, we first calculated the path distance between the vertices in the subgraph.The obtained path distance matrix was then processed using the same gating unit to obtain the centroid features.In order to predict multi-category of PPI, we first multiplied the features of the protein itself and its interacting protein.The resulting multiplied features were then fed into a FC layer.The output of the FC layer was a 7-dimensional matrix.Finally, by applying the sigmoid function to the 7-dimensional matrix, we completed the multi-category prediction of PPI.

GIN-AK
GNN have proven to be an effective framework for topology representation learning.GIN is widely regarded as the most repressive GNN [45], but it still faces limitation in breaking through the first-order Weisfeiler-Leman isomorphism testing [46,47].To overcome this limitation, a subgraph operation had been introduced based on GIN.This approach updated vertex information by using the feature of the subgraph of vertex, rather than relying solely on the neighboring vertex feature information.This transformation reduced the complexity of the graph feature problem to a smaller and simpler subgraph feature problem.The resulting GIN model was called GIN-AK (GIN As Kernel).In the process of updating vertices using graph convolutional operations, each vertex aggregates information directly from neighboring vertices in a star operation.The star operation Star (v) forms a graph that can be defined as formu- lar (1).The specific update method for vertices in GIN is shown in formula 2.
The limitation of graph convolutional operations lies in their inability to differentiate between graphs that possess the same vertex degree but exhibit different structures.In order to address this issue, we proposed the utilization of a subgraph encoder, denoted as , to substitute the Star (v) operation.This replacement significantly enhanced the expressive capabilities of the graph convolutional operation.The subgraph encoder G [N k (v)] facilitated the update of vertex features by constructing subgraph features within a k -hop egonet centered around the vertex v .This update process was displayed in formula (3), which specifically outlined the procedure for updating vertices in GIN-AK.
In GIN-AK, a k -hop propagation mechanism was employed to acquire the subgraph surrounding each vertex.Additionally, the path distance between each vertex and the centroid of its corresponding subgraph was computed.This information enhanced the vertex features and contributed to the overall improvement of graph convolutional expressiveness.To capture the nonlinear transformations of global features and centroid features of the subgraph, gating units were introduced.Finally, the global features and centroid features of the subgraph were obtained using formula ( 4) and ( 5), respectively.
Among them, d (l+1) v|j represented the path distance matrix from vertex v to vertex j in the l +1-layer, and represented element wise multiplication.Combining the global features and centroid features of the subgraph, the update process of vertex h (l+1) v was shown in formula (6).
GIN-AK improved the expressiveness of graph convolutional operation through the use of subgraphs instead of star graphs.This enhancement allowed GIN-AK to capture underlying structural features in PPI network, leading to the improved performance in predicting different categories of protein-protein interactions.

ASL
In this study, PPI prediction was regarded as a multi-category classification task.However, the dataset (Table 1) used was imbalanced, which means that there were more samples corresponding to some categories compared to others.This can affect the performance evaluation of the model since it might focus more on the majority classes and ignore the minority classes.To address this problem, we introduced the Asymmetric Loss (ASL) [48] function.
In multi-category imbalanced datasets, using symmetric loss functions like Focal Loss [49], BCE Loss [50], or Cross Entropy [51] may not effectively learn some features from positive samples.These loss functions tend to focus more on negative samples than positive samples [48], which can be suboptimal.For example, in Focal Loss function (formula 8), when using the same γ for multi- category training, it may eliminate the gradient of sparse positive samples.Among them, p = σ (z) is the output probability of the model, γ is the focusing parameter, L + and L − represent the positive and negative loss parts, respectively.
Therefore, in ASL, the focusing parameters of the loss function are decoupled into positive focusing parameter γ + and negative focusing parameter γ − .This allows for asymmetric focusing, enabling better control the effect of positive and negative samples on the loss function.In addition, considering the asymmetric focusing parameters in ASL, there are shortcomings in some cases where there are not enough negative samples.In ASL, another asymmetric mechanism of probability translation is further proposed, as shown in formula (9).
m represents an adjustable probability margin.Through the above two adjustments, the final definition of ASL is shown in formula (10).
By employing ASL to dynamically regulate the degree of asymmetry throughout the entire training process, the selection of hyperparameters is simplified.This effectively balances the focus of the network on positive and negative samples, ultimately improving the accuracy of multi-category prediction.

Model training
We trained our model for 400 epochs using the Adam optimizer [52] with a batch size of 1024 and an initial learning rate of 0.01.To optimize the training process, we implemented a learning rate decay function, set the patience to 20, and stopped training when the model did not reduce its loss after 20 epochs.To prevent overfitting, we used a dropout rate of 0.2 in the local subgraph features extraction part and 0.5 in the global graph features extraction section.In the local subgraph features extraction part, we set k in k -hop to 1, extracted the 1-hop subgraph as the input graph of GIN-AK, and used ASL with a probability margin m of 0.05.For asymmetric focusing, we set γ + to 1 and γ − to 0, as recommended by ASL.

Evaluation metrics
In this study, we regarded PPI prediction as a multi-category classification task.The dataset used in our study had an imbalanced distribution of samples across different categories.To effectively evaluate the model's performance on this imbalanced dataset, we employed F1-measure as the evaluation metric.F1-measure is a widely-used evaluation metric for imbalanced datasets because it takes into account both precision and recall, providing a balanced measure of model performance [53].

precision = T P T P +
Among them, TP, FP, TN, and FN represent the number of predicted true positive, false positive, true negative, and false negative samples, respectively.

Multimodal pre-training features offer a better representation of protein
The predictive performance of a model is directly influenced by the quality of its input features.In this study, we aimed to select high-quality features for protein vertices in the PPI network.To achieve this, we compared the sequence features of proteins with features obtained from the latest multimodal pre-training model of proteins.structures, gene ontology annotations, motifs, and region positions to extract comprehensive protein features at a higher level.Subsequently, we constructed a GIN model and used the protein sequence features as well as the multimodal pre-training features as vertex features in the PPI network graph to predict PPI.

Experimental results presented in
In the case of the SHS27K dataset, the GIN model based on multimodal pre-trained features exhibited the improved performance under three different dataset partitioning algorithms (Random, BFS, and DFS).Specifically, the F1-measure increased by approximately 0.7%, 7%, and 4%, respectively.As for the SHS148K dataset, the GIN network based on multimodal pre-trained features performed improvements of 1%, 4%, and 1% under the Random, BFS, and DFS algorithms, respectively.These experimental findings indicated that when using graph neural networks for predicting PPI, multimodal pretrained features offered a better representation of protein compared to sequence features alone.

Local subgraph features can enhance the performance of the model
In this study, we proposed a novel operation for updating vertex information in GIN.Instead of directly aggregating information from neighboring vertices in a star operation, we updated the vertex information through the features of the vertex's subgraph.To assess the contribution of local subgraph features to the model's performance, we conducted ablation experiments based on global graph features, local subgraph features, and their combined features.The global graph features were extracted directly using the GIN model, while the local subgraph features were extracted using the GIN-AK.The combined features were obtained by fusing global graph features and local subgraph features.Experimental results presented in Table 3 indicated that, based solely on global graph features and local subgraph features, the latter performed approximately 5% better than the former for the BFS partitioning algorithm on both SHS27K and SHS148K datasets, while they exhibited similar performance under the other two partitioning algorithms.However, after combining global graph features and local subgraph features, the combined model showed better performance than the global graph features-based and local subgraph featuresbased models under the two partitioning algorithms (Random and DFS) of the SHS148K dataset and all partitioning algorithms of the SHS27K dataset.Although the F1-measure of the combined model was 1% lower than that of the local subgraph features-based model under the BFS partitioning algorithm of the SHS148K dataset, these findings still showed that incorporating local subgraph features on the basis of global graph features could enhance the model's performance.

Selection of parameter k in k-hop
To select the most effective parameter k in k-hop, we conducted experiments on the SHS27K dataset using both k values of 1 and 2. Experiment results, as presented in Table 4, showed that a k value of 1 outperformed the alternative.Due to the significant computational cost associated with calculating subgraphs for all vertices, we decided against conducting further experiments on the SHS27K dataset where the k value was greater than or equal to 3. Additionally, given the large size of the SHS148K dataset, it was not feasible to perform subgraph calculation experiments with a k value greater than or equal to 2 on our device.As a result, we ultimately settled on a k value of 1.

The impact of ASL on the performance of the model
In the previous experiments, we utilized the BCE loss function, which is a type of symmetric loss function that does not address the issue of imbalanced sample sizes with different interaction categories in the dataset.To address this limitation, we introduced the ASL and conducted some experiments using it.Experimental results presented in Table 5 revealed that, under the three partitioning algorithms (Random, BFS, and DFS) on both SHS27K and SHS148K datasets, the model based on ASL achieved higher F1-measure than the model based on the BCE loss function.Specifically, in the SHS27K dataset, F1-measure increased by approximately 0.20%, 2%, and 0.15%, respectively, while in the SHS148K dataset, F1-measure increased by approximately 1%, 1%, and 2%, respectively.These results illustrated that utilizing the ASL can effectively enhance the performance of the model on imbalanced datasets.

Performance comparison of GNNGL-PPI with the state-ofthe-art methods
In this study, GNNGL-PPI exhibited good performance across three different partitioning algorithms on the SHS27K and SHS148K datasets.To comprehensively assess its performance, we conducted a comparison with two machine learning methods, including Random Forest (RF) and Logistic Regression (LR), as well as four deep learning methods like PIPR, GNN-PPI, LDMGNN [54], and SemiGNN-PPI.Experimental results (Table 6) showed that GNNGL-PPI consistently outperformed these state-of-the-art methods in terms of F1-measure under all three partitioning algorithms on both datasets.These findings affirmed the reliability of GNNGL-PPI as a PPI predictor.

Performance comparison between GNNGL-PPI and state-of-the-art methods under the multi-classification evaluation metrics
In addition to using binary classification evaluation metrics, we further utilized multi-classification evaluation metrics such as Weighted precision, Weighted recall, Weighted f1-score, Macro precision, Macro recall, Macro f1-score, Micro precision, Micro recall, and Micro f1-score to evaluate the model's performance.Experimental results (Table 7) showed that GNNGL-PPI exhibited superior performance compared to other state-of-the-art methods.These outcomes highlighted the effectiveness of GNNGL-PPI in predicting the multicategory of PPI.

Performance comparison of some statistical tests between GNNGL-PPI and state-of-the-art methods
In this study, we dealt with an imbalanced dataset where the number of samples across different categories varied.To accurately evaluate the model's performance on this imbalanced dataset, we opted for the F1 metric as it provided a more accurate reflection of the model's true performance.Consequently, we conducted some statistical tests on the F1 metric of three methods including GNNPPI, LDMGN, and GNNGL-PPI using three distinct partitioning algorithms on both SHS27K and SHS148K datasets.Experiment results, as shown in Table 8, indicated that GNNGL-PPI exhibited slightly superior performance across two statistical testing metrics compared to two other state-of-the-art methods.

Statistical analysis of prediction accuracy of GNNGL-PPI and GNN-PPI under different interaction categories
Table 6 showed that the GNNGL-PPI method outperformed other state-of-the-art methods for comparison under three different partitioning algorithms on both the SHS27K and SHS148K datasets.To further evaluate the predictive performance of GNNGL-PPI across different interaction categories, we compared its prediction accuracy with the GNN-PPI method under the BFS partitioning algorithm using the SHS27K dataset (Fig. 2).Statistical analysis revealed that for the interaction category Reaction with 3164 samples (Table 1), after being divided by the BFS algorithm, 613 test samples were obtained.GNNGL-PPI correctly predicted 496 of the 569 samples it identified, yielding an accuracy of 80.91% (Fig. 2a).In comparison, GNN-PPI correctly predicted 434 of the 510 samples, achieving an accuracy of 70.79%.Similarly, for the Binding, Ptmod, Activation, Inhibition, Catalysis, and Expression categories, GNNGL-PPI achieved a higher accuracy (within parentheses) than that of GNN-PPI (85.19%, 71.55%), (86.02%, 71.55%), (84.35%, 81.42%), (86.67%, 85.74%), (87.47%, 74.94%), and (42.55%, 28.72%), respectively (Fig. 2b).In all interaction categories except for Action and Inhibition, GNNGL-PPI exhibited classification accuracy that was more than 10% higher than that of GNN-PPI.Notably, under the Expression category, there were only 94 test samples.Despite this,  GNNGL-PPI and GNN-PPI correctly predicted 27 and 40 samples, respectively, with an accuracy improvement of 13.83%.This suggested that our proposed GNNGL-PPI method could learn features from small sample sizes and made accurate predictions, mitigating the problem of low prediction accuracy due to imbalanced dataset categories with small sample sizes.

Interpretability analysis of the effectiveness of GNNGL-PPI
To thoroughly evaluate the effectiveness of GNNGL-PPI, we applied the widely recognized dimensionality   reduction algorithm t-SNE [55] to the SHS27K dataset.This dataset was partitioned using the Random alogrithm.t-SNE proved to be an ideal choice for our analysis as it preserved the proximity of closely positioned data points even after reducing their dimensions.At the same time, it effectively maintained the separation between originally distant data points.We utilized t-SNE to gain insights into the effectiveness of GNNGL-PPI by visualizing the clustered representations resulting from dimensionality reduction at different epochs of model training.Due to the rapid convergence of GNNGL-PPI in the early stages of training, we applied t-SNE to reduce the dimensionality of features learned in the 1st, 5th, 20th, 50th, 100th, and 200th epochs.The clustering visualization obtained after one epoch of model training (Fig. 3a) revealed that data points representing different interaction categories were intertwined, with no distinct separation between them.However, as the number of training epochs increased, the clustering visualizations after the 5th, 20th, 50th, and 100th epochs (Fig. 3b to e) clearly showed that data points representing different interaction categories gradually moved away from each other, while data points representing the same interaction category started to cluster together.By the 200th epoch, clear boundaries between data points of different interaction categories were observed (Fig. 3f ).It's important to note that since this study treated PPI as a multi-category classification task, some data points from different interaction categories may still appear in the same cluster.Nevertheless, the clustering visualization results from different training epochs indicated that GNNGL-PPI was an effective method for predicting multi-category of PPI.

Discussions
Although GNNGL-PPI has exhibited a fine performance in predicting multiple categories of protein-protein interactions and can provide explanatory analysis for its effectiveness, it also has certain shortcomings: 1.The training of GNNGL-PPI was conducted on two publicly available benchmark datasets, and the model's learned deep features were limited.This limitation may result in a gap between the generalization performance of GNNGL-PPI and users' expected thresholds.2. GNNGL-PPI relied on sequence features and GO annotations of proteins, while overlooking protein structural features.However, protein structural features often play a crucial role in the performance of PPI models.3. The utilization of GNNGL-PPI requires users to input protein-protein interaction networks, which undoubtedly increases the complexity for nonprofessionals and restricts the application and promotion of GNNGL-PPI.
In response to these shortcomings, we have carefully considered the issues and proposed potential solutions: (1) To address the limited deep features learned by GNNGL-PPI, we suggest crawling a more comprehensive PPI interaction dataset from databases.By constructing a more extensive PPI interaction network for training GNNGL-PPI, we can gradually enhance its generalization performance.the exception of the Action and Inhibition interaction categories, the prediction accuracy of both methods was similar.However, in the remaining five interaction categories, GNNGL-PPI exhibited a prediction accuracy that was more than 10% higher than that of GNN-PPI.
(2) In order to incorporate protein structural features into GNNGL-PPI, we recommend extracting such features from protein topology graphs or threedimensional grids using graph neural networks or 3D-CNN.This approach will enrich the protein features and improve the model's performance.(3) To alleviate the challenges faced by non-professionals in utilizing GNNGL-PPI, we propose developing an online service tool.With this tool, users would only need to input protein pairs, and the tool would automatically construct a protein-protein interaction network and predict the interaction categories between proteins.
In order to further improve the performance of GNNGL-PPI, we believe that in-depth research should be conducted in the following areas in the future: (a) Introducing self-supervised contrastive learning into the GNNGL-PPI method enhances its feature extraction ability, enabling the method to learn more key implicit features that affect multi-category prediction of protein-protein interactions.(b) At present, the protein features used are still relatively single, and multiple modal information such as protein sequences and structures should be integrated to promote the model to achieve better performance based on the rich features of proteins.(c) Deepening the interpretive analysis of the effectiveness of protein-protein interaction prediction methods will not only promote the application of this method, but also provide support for the research of new methods.
GNNGL-PPI not only provides PPI prediction services to support users in understanding the mechanism of protein-protein interaction, but also the predicted PPI categories or PPI networks constructed from them can provide deep and rich features for drug-target interaction prediction and other related tasks, which is helpful for drug screening, repositioning, and target recognition research.We will briefly provide relevant work in Appendix A.

Conclusions
GNNGL-PPI is a novel method for predicting multi-category of PPI based on graph neural networks.This innovative approach leverages the power of GIN and GIN-AK to extract the features from both the global graph and local subgraph of the PPI network.Additionally, the use of ASL helps enhance the model's performance on imbalanced datasets.Experimental results have showed that GNNGL-PPI surpasses the performance of existing state-of-the-art methods for PPI prediction on two standard datasets.The effectiveness of GNNGL-PPI is further supported by the t-SNE algorithm, which provides visual evidence of the model's capability.Therefore, GNNGL-PPI is a reliable multi-category prediction tool for protein-protein interactions.
Finally, we concatenated the global features and centroid features of the subgraph, applied batch normalization, and obtained the local subgraph features.The global graph features and local subgraph features were concatenated to form the vertex features in the PPI network graph.It was important to note that ReLU activation function was used in the gating unit during the subgraph features extraction

Fig. 1
Fig. 1 The architecture review of GNNGL-PPI.(A) We utilized the pre-training model MASSA to obtain the comprehensive protein features based on multimodal data such as protein structures, sequences, and gene ontology annotations.(B) GIN-AK extracted the global features and centroid features of subgraphs through two different processes, and then added these extracted features to obtain the final local subgraph features.(C) The extracted global features from the PPI network graph through GIN and the extracted local subgraph features by GIN-AK were concatenated to obtain the high-level features of vertices in the PPI network, i.e., protein features.We multiplied the features of proteins themselves and their interacting proteins and sent them into the multilayer perceptron (MLP) to complete multi-category prediction of protein-protein interactions GNNPPI < GNNGLPPI (b) GNNPPI > GNNGLPPI (c) GNNPPI = GNNGLPPI (d) LDMGN < GNNGLPPI (e) LDMGN > GNNGLPPI (f ) LDMGN = GNNGLPPI Statistic tests Wilconxon sign rank test (b) Based on positive rank

Fig. 2
Fig.2Performance comparison between GNNGL-PPI and GNN-PPI across different interaction categories using the BFS partitioning algorithm on the SHS27K dataset.(a) In each interaction category, GNNGL-PPI outperformed GNN-PPI by predicting more correct samples (green color) as compared to the latter (red color).Particularly in the Expression category, GNNGL-PPI correctly predicted 40 positive samples, while GNN-PPI only achieved 27.(b) With the exception of the Action and Inhibition interaction categories, the prediction accuracy of both methods was similar.However, in the remaining five interaction categories, GNNGL-PPI exhibited a prediction accuracy that was more than 10% higher than that of GNN-PPI.

Fig. 3
Fig. 3 Interpretability analysis of the effectiveness of GNNGL-PPI.We used t-SNE algorithm to elucidate GNNGL-PPI's effectiveness by contrasting the clustered visualizations resulting from dimensionality reduction for multiple different epochs of the model training of GNNGL-PPI.The clustering visualization results, represented as (a) to (f), corresponded to the features learned in the first, fifth, 20th, 50th, 100th, and 200th epochs of model training, respectively.Upon examining these results, a clear pattern emerged: data points representing different interaction categories gradually moved apart as the number of training epochs increased, while data points representing the same interaction category gradually clustered together.The clustering visualization results showed that GNNGL-PPI was an effective PPI multi-category predictor

Table 1
Number and occupation ratio of samples for seven categories of protein-protein interactions

Table 2
Performance comparison between multimodal pre-training features and sequence features

Table 3
Performance comparison of global graph, subgraph, and their combined features

Table 4
Performance comparison of different k values based on SHS27K

Table 5
Performance comparison of asymmetric (ASL) and symmetric (BCE) loss functions

Table 6
Performance comparison of GNNGL-PPI with the state-of-the-art methods

Table 7
Performance comparison between GNNGL-PPI and other state-of-the-art methods under the multi-classification evaluation metrics

Table 8
Performance comparison of some statistical tests between GNNGL-PPI and state-of-the-art methods